Evaluation of term utility functions for very short multidocument summaries
Authors
Abstract
We describe results from an application for relevance assessment in a setting related to multi-document summarization. For the task of characterizing a given document collection by a short list of relevant terms, we have proposed the term utility function PxR. The measure is competitive with a variety of utility functions commonly used in text mining. Our function incorporates a user-definable parameter that allows an explicit, continuous trade-off between precision and recall, which our users preferred over the more opaque term utility functions from text mining. The Fβ measure is similar but not identical to our measure and is also discussed. Despite our users' preference for a user-definable parameter, the improvement gained by setting a different parameter value for each document collection is limited, and a static value for the parameter works almost as well. This seems to be true for the Fβ measure as well. A simple measure, SR, also performs competitively. In light of this evidence, a user-definable parameter seems to be unnecessary for competitive performance.
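To make the precision/recall trade-off concrete, the following is a minimal sketch of ranking terms with the standard Fβ measure. It does not reproduce PxR or SR, whose definitions are not given in this abstract. The term-level definitions of precision and recall used here (document-frequency based, against a hypothetical background collection) and the helper names `term_precision_recall` and `rank_terms` are assumptions made for illustration only.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta: beta > 1 favours recall, beta < 1 favours precision."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def term_precision_recall(term, target_docs, background_docs):
    """Assumed definitions (not taken from the paper):
    precision = share of documents containing the term that are in the target collection
    recall    = share of target documents that contain the term
    Each document is represented as a set of terms.
    """
    in_target = sum(1 for doc in target_docs if term in doc)
    in_background = sum(1 for doc in background_docs if term in doc)
    total = in_target + in_background
    precision = in_target / total if total else 0.0
    recall = in_target / len(target_docs) if target_docs else 0.0
    return precision, recall


def rank_terms(target_docs, background_docs, beta=1.0, k=5):
    """Return the k highest-scoring terms of the target collection under F-beta."""
    vocab = set().union(*target_docs) if target_docs else set()
    scored = []
    for term in vocab:
        p, r = term_precision_recall(term, target_docs, background_docs)
        scored.append((f_beta(p, r, beta), term))
    # Sort by descending score, breaking ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


if __name__ == "__main__":
    target = [{"summary", "term"}, {"summary", "utility"}]
    background = [{"term", "cat"}, {"dog"}]
    print(rank_terms(target, background, beta=1.0, k=3))
```

In this sketch, `beta` plays the role of the single user-definable parameter: raising it rewards terms that cover more of the target collection, while lowering it rewards terms that rarely occur outside it.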
Similar resources
Machine and Human Performance for Single and Multidocument Summarization
coherency—and be able to draw the “best” information from a set of documents. Automatic single-document text summarization1 has been an active research area since the 1950s, with a renaissance of approaches since the 1990s. Human single-document summarization is well defined when guidelines and recommendations drive performance.2,3 System-generated single-document summaries, while not always ma...
Experiments with CST-Based Multidocument Summarization
Recently, with the huge and growing amount of information on the web and the little time available to read and process all of it, automatic summaries have become very important resources. In this work, we evaluate deep content selection methods for multidocument summarization based on the CST model (Cross-document Structure Theory). Our methods consider summarization preferences and focu...
A Cosine Maximization-Minimization approach for User-Oriented Multi-Document Update Summarization
This paper presents a User-Oriented Multi-Document Update Summarization system based on a maximization-minimization approach. Our system relies on two main concepts. The first is cross-summary sentence redundancy removal, which attempts to limit the redundancy of information between the update summary and the previous ones. The second concept is the newness of information detection in a cl...
Automatic multidocument summarization of research abstracts: Design and user evaluation
The purpose of this study was to develop a method for automatic construction of multi-document summaries of sets of research abstracts that may be retrieved by a digital library or search engine in response to a user query. Sociology dissertation abstracts were selected as the sample domain in this study. A variable-based framework was proposed for integrating and organizing research concepts a...
Sub-Event Based Multi-Document Summarization
The production of accurate and complete multiple-document summaries is challenged by the complexity of judging the usefulness of information to the user. Our aim is to determine whether identifying sub-events in a news topic could help us capture essential information to produce better summaries. We used six methods to create multi-document summaries and then compared them to find which method ...
Journal title:
- Applied Artificial Intelligence
Volume 20
Publication date 2006